Overview

Dataset statistics

Number of variables: 10
Number of observations: 889
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 62.6 KiB
Average record size in memory: 72.1 B

Variable types

Numeric: 6
Categorical: 4

Alerts

df_index is highly correlated with PassengerId (High correlation)
PassengerId is highly correlated with df_index (High correlation)
Survived is highly correlated with Sex (High correlation)
Pclass is highly correlated with Fare (High correlation)
Sex is highly correlated with Survived (High correlation)
Fare is highly correlated with Pclass (High correlation)
Pclass is highly correlated with Fare and 1 other field (High correlation)
SibSp is highly correlated with Parch and 1 other field (High correlation)
Parch is highly correlated with SibSp (High correlation)
Fare is highly correlated with Pclass and 1 other field (High correlation)
Embarked is highly correlated with Pclass (High correlation)
df_index is uniformly distributed (Uniform)
PassengerId is uniformly distributed (Uniform)
df_index has unique values (Unique)
PassengerId has unique values (Unique)
SibSp has 606 (68.2%) zeros (Zeros)
Parch has 676 (76.0%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)

Reproduction

Analysis started: 2021-12-13 14:54:21.125725
Analysis finished: 2021-12-13 14:54:35.915198
Duration: 14.79 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 889
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 445
Minimum: 0
Maximum: 890
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 44.4
Q1: 223
median: 445
Q3: 667
95-th percentile: 845.6
Maximum: 890
Range: 890
Interquartile range (IQR): 444

Descriptive statistics

Standard deviation: 256.9981728
Coefficient of variation (CV): 0.5775239838
Kurtosis: -1.197156422
Mean: 445
Median Absolute Deviation (MAD): 222
Skewness: 0
Sum: 395605
Variance: 66048.06081
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   1      0.1%
598                 1      0.1%
587                 1      0.1%
588                 1      0.1%
589                 1      0.1%
590                 1      0.1%
591                 1      0.1%
592                 1      0.1%
593                 1      0.1%
594                 1      0.1%
Other values (879)  879    98.9%

Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%

Value  Count  Frequency (%)
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
886    1      0.1%
885    1      0.1%
884    1      0.1%
883    1      0.1%
882    1      0.1%
881    1      0.1%

PassengerId
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 889
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.4
Q1: 224
median: 446
Q3: 668
95-th percentile: 846.6
Maximum: 891
Range: 890
Interquartile range (IQR): 444

Descriptive statistics

Standard deviation: 256.9981728
Coefficient of variation (CV): 0.5762290869
Kurtosis: -1.197156422
Mean: 446
Median Absolute Deviation (MAD): 222
Skewness: 0
Sum: 396494
Variance: 66048.06081
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.1%
599                 1      0.1%
588                 1      0.1%
589                 1      0.1%
590                 1      0.1%
591                 1      0.1%
592                 1      0.1%
593                 1      0.1%
594                 1      0.1%
595                 1      0.1%
Other values (879)  879    98.9%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Value  Count  Frequency (%)
891    1      0.1%
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
886    1      0.1%
885    1      0.1%
884    1      0.1%
883    1      0.1%
882    1      0.1%

Survived
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Value  Count
0      549
1      340

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      549    61.8%
1      340    38.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      549    61.8%
1      340    38.2%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Pclass
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Value  Count
3      491
1      214
2      184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3      491    55.2%
1      214    24.1%
2      184    20.7%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
3      491    55.2%
1      214    24.1%
2      184    20.7%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Sex
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Value  Count
1      577
0      312

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
1      577    64.9%
0      312    35.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      577    64.9%
0      312    35.1%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Age
Real number (ℝ≥0)

Distinct: 88
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.31515186
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 6
Q1: 22
median: 28
Q3: 35
95-th percentile: 54
Maximum: 80
Range: 79.58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 12.98493229
Coefficient of variation (CV): 0.4429426925
Kurtosis: 1.007819813
Mean: 29.31515186
Median Absolute Deviation (MAD): 6
Skewness: 0.5080100783
Sum: 26061.17
Variance: 168.6084667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
28                 202    22.7%
24                 30     3.4%
22                 27     3.0%
18                 26     2.9%
19                 25     2.8%
30                 25     2.8%
21                 24     2.7%
25                 23     2.6%
36                 22     2.5%
29                 20     2.2%
Other values (78)  465    52.3%

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%
1      7      0.8%
2      10     1.1%
3      6      0.7%
4      10     1.1%
5      4      0.4%

Value  Count  Frequency (%)
80     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
66     1      0.1%
65     3      0.3%
64     2      0.2%
63     2      0.2%
62     3      0.3%

SibSp
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5241844769
Minimum: 0
Maximum: 8
Zeros: 606
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.103704876
Coefficient of variation (CV): 2.105565739
Kurtosis: 17.83897238
Mean: 0.5241844769
Median Absolute Deviation (MAD): 0
Skewness: 3.691057631
Sum: 466
Variance: 1.218164452
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      606    68.2%
1      209    23.5%
2      28     3.1%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%

Value  Count  Frequency (%)
0      606    68.2%
1      209    23.5%
2      28     3.1%
3      16     1.8%
4      18     2.0%
5      5      0.6%
8      7      0.8%

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.1%
1      209    23.5%
0      606    68.2%

Parch
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3824521935
Minimum: 0
Maximum: 6
Zeros: 676
Zeros (%): 76.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8067607445
Coefficient of variation (CV): 2.109442064
Kurtosis: 9.750591706
Mean: 0.3824521935
Median Absolute Deviation (MAD): 0
Skewness: 2.745160126
Sum: 340
Variance: 0.6508628989
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      676    76.0%
1      118    13.3%
2      80     9.0%
5      5      0.6%
3      5      0.6%
4      4      0.4%
6      1      0.1%

Value  Count  Frequency (%)
0      676    76.0%
1      118    13.3%
2      80     9.0%
3      5      0.6%
4      4      0.4%
5      5      0.6%
6      1      0.1%

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.4%
3      5      0.6%
2      80     9.0%
1      118    13.3%
0      676    76.0%

Fare
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 247
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.09668088
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.8958
median: 14.4542
Q3: 31
95-th percentile: 112.31832
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.1042

Descriptive statistics

Standard deviation: 49.69750432
Coefficient of variation (CV): 1.548368958
Kurtosis: 33.50847727
Mean: 32.09668088
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.801440211
Sum: 28533.9493
Variance: 2469.841935
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              38     4.3%
7.75                34     3.8%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
7.2292              15     1.7%
26.55               15     1.7%
Other values (237)  613    69.0%

Value   Count  Frequency (%)
0       15     1.7%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  1      0.1%
6.45    1      0.1%
6.4958  2      0.2%
6.75    2      0.2%
6.8583  1      0.1%
6.95    1      0.1%

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.4%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.4%
221.7792  1      0.1%
211.5     1      0.1%
211.3375  3      0.3%
164.8667  2      0.2%
153.4625  3      0.3%

Embarked
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.5 KiB

Value  Count
2      644
0      168
1      77

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 0
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
2      644    72.4%
0      168    18.9%
1      77     8.7%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
2      644    72.4%
0      168    18.9%
1      77     8.7%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Interactions

[36 interaction plots omitted]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
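
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names `average_ranks` and `spearman_rho` and the sample data are made up for the example, and tied values are assumed to receive the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    # The 1/n factors of the covariance and the variances cancel, so sums suffice.
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    var_x = sum((a - mean_rx) ** 2 for a in rx)
    var_y = sum((b - mean_ry) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: y = x**3 is nonlinear but strictly increasing in x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
tied = average_ranks([10, 20, 20, 30])
```

Because only the ordering of the values matters, the cubic relationship in the sample data yields the same ρ as a straight line would.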

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
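
A minimal sketch of this calculation in plain Python follows. The function name `pearson_r` and the sample data are hypothetical and chosen only for illustration; the second call shifts and rescales both variables to show that r is unchanged.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the variances cancel, so sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(x, y)  # 0.8

# Separate changes of location and (positive) scale leave r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```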

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
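
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below in plain Python. The function name `kendall_tau_a` and the sample data are hypothetical. Statistical libraries commonly default to tau-b, which adjusts the denominator for ties, so values may differ from this sketch when the data contain tied observations.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical sample data: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)  # 0.6
```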

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
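
As an illustration, the bias correction can be sketched in plain Python as below. This is an assumption-laden sketch, not the report's own implementation: the function name `cramers_v_corrected` and the contingency tables are made up, the chi-squared statistic is computed without a continuity correction, and every row and column of the table is assumed to have a non-zero total.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Subtract the expected bias from phi^2 and shrink the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical tables of counts (rows: one variable, columns: the other).
v_independent = cramers_v_corrected([[20, 30], [40, 60]])  # 0.0
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # approximately 1.0
v_mixed = cramers_v_corrected([[30, 10], [15, 45]])
```

Clamping the corrected phi-squared at zero keeps the coefficient inside the stated 0 to 1 range when the observed association is smaller than the expected bias.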

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

     df_index  PassengerId  Survived  Pclass  Sex  Age     SibSp  Parch  Fare    Embarked
0    0         1            0         3       1    22.000  1      0      7.250   2
1    1         2            1         1       0    38.000  1      0      71.283  0
2    2         3            1         3       0    26.000  0      0      7.925   2
3    3         4            1         1       0    35.000  1      0      53.100  2
4    4         5            0         3       1    35.000  0      0      8.050   2
5    5         6            0         3       1    28.000  0      0      8.458   1
6    6         7            0         1       1    54.000  0      0      51.862  2
7    7         8            0         3       1    2.000   3      1      21.075  2
8    8         9            1         3       0    27.000  0      2      11.133  2
9    9         10           1         2       0    14.000  1      0      30.071  0
991012014.0001030.0710

Last rows

     df_index  PassengerId  Survived  Pclass  Sex  Age     SibSp  Parch  Fare    Embarked
879  881       882          0         3       1    33.000  0      0      7.896   2
880  882       883          0         3       0    22.000  0      0      10.517  2
881  883       884          0         2       1    28.000  0      0      10.500  2
882  884       885          0         3       1    25.000  0      0      7.050   2
883  885       886          0         3       0    39.000  0      5      29.125  1
884  886       887          0         2       1    27.000  0      0      13.000  2
885  887       888          1         1       0    19.000  0      0      30.000  2
886  888       889          0         3       0    28.000  1      2      23.450  2
887  889       890          1         1       1    26.000  0      0      30.000  0
888  890       891          0         3       1    32.000  0      0      7.750   1